Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 3953 |
| Missing cells | 1047 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 772.2 KiB |
| Average record size in memory | 200.0 B |
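The derived figures in this table follow from the raw counts. The sketch below reproduces them, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB is 1024 bytes. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 25
n_observations = 3953
missing_cells = 1047
total_size_kib = 772.2

# Assumption: the percentage is relative to every cell in the dataset.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the computed values round to the 1.1% and 200.0 B shown in the table.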
Variable types
| CAT | 14 |
|---|---|
| NUM | 10 |
| BOOL | 1 |
Reproduction
| Analysis started | 2020-05-26 08:15:51.125456 |
|---|---|
| Analysis finished | 2020-05-26 08:16:15.789395 |
| Duration | 24.66 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| Name has a high cardinality: 3682 distinct values | High cardinality |
| Email ID has a high cardinality: 3373 distinct values | High cardinality |
| University has a high cardinality: 3140 distinct values | High cardinality |
| Zip Code has a high cardinality: 615 distinct values | High cardinality |
| Funded amnt inv is highly correlated with Loan Amnt and 1 other fields | High correlation |
| Loan Amnt is highly correlated with Funded amnt inv and 1 other fields | High correlation |
| INSTALLMENT is highly correlated with Loan Amnt and 1 other fields | High correlation |
| Sub Grade is highly correlated with GRADE | High correlation |
| GRADE is highly correlated with Sub Grade | High correlation |
| Name has 271 (6.9%) missing values | Missing |
| Email ID has 580 (14.7%) missing values | Missing |
| Gender has 78 (2.0%) missing values | Missing |
| University has 118 (3.0%) missing values | Missing |
| Name is uniformly distributed | Uniform |
| Email ID is uniformly distributed | Uniform |
| University is uniformly distributed | Uniform |
| Dt_Applied has unique values | Unique |
| Delinq 2Yrs has 3628 (91.8%) zeros | Zeros |
| Inq Last 6Mths has 1822 (46.1%) zeros | Zeros |
| Revol Bal has 42 (1.1%) zeros | Zeros |
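The percentages quoted in the missing-value and zero warnings can be cross-checked against the observation count. The sketch below assumes each percentage is the count divided by the 3953 observations, rounded to one decimal place. The helper `share_pct` is hypothetical and not part of pandas-profiling.

```python
n_observations = 3953  # from the dataset statistics

# (column, kind, count, percentage as reported in the warnings)
reported = [
    ("Name", "missing", 271, 6.9),
    ("Email ID", "missing", 580, 14.7),
    ("Gender", "missing", 78, 2.0),
    ("University", "missing", 118, 3.0),
    ("Delinq 2Yrs", "zeros", 3628, 91.8),
    ("Inq Last 6Mths", "zeros", 1822, 46.1),
    ("Revol Bal", "zeros", 42, 1.1),
]


def share_pct(count, total):
    """Hypothetical helper: share of `total`, in percent, to one decimal."""
    return round(100 * count / total, 1)


# Collect any warning whose percentage does not match the recomputed share.
mismatches = [
    (column, kind)
    for column, kind, count, pct in reported
    if abs(share_pct(count, n_observations) - pct) > 1e-9
]

# The per-column missing counts should add up to the dataset-level figure.
total_missing = sum(count for _, kind, count, _ in reported if kind == "missing")

print("Mismatches:", mismatches)
print("Total missing cells:", total_missing)
```

The four missing-value counts sum to 1047, which agrees with the "Missing cells" figure in the dataset statistics.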
Name
Categorical
| Distinct count | 3682 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 271 |
| Missing (%) | 6.9% |
| Memory size | 30.9 KiB |
| Rollie Leathley | 1 |
|---|---|
| Issi Woodhead | 1 |
| Ezequiel Canty | 1 |
| Doloritas Adamski | 1 |
| Hadleigh Pleasance | 1 |
| Other values (3677) |
| Value | Count | Frequency (%) | |
| Rollie Leathley | 1 | < 0.1% | |
| Issi Woodhead | 1 | < 0.1% | |
| Ezequiel Canty | 1 | < 0.1% | |
| Doloritas Adamski | 1 | < 0.1% | |
| Hadleigh Pleasance | 1 | < 0.1% | |
| Lowell Bleaden | 1 | < 0.1% | |
| Una Eagger | 1 | < 0.1% | |
| Leanora Neeve | 1 | < 0.1% | |
| Olivia Batch | 1 | < 0.1% | |
| Giavani Swyre | 1 | < 0.1% | |
| Other values (3672) | 3672 | 92.9% | |
| (Missing) | 271 | 6.9% |
Length
| Max length | 23 |
|---|---|
| Median length | 14 |
| Mean length | 13.27649886 |
| Min length | 3 |
Email ID
Categorical
| Distinct count | 3373 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 580 |
| Missing (%) | 14.7% |
| Memory size | 30.9 KiB |
| tguymeret@icq.com | 1 |
|---|---|
| mlobleyqp@loc.gov | 1 |
| blenevegh@paginegialle.it | 1 |
| jspringettic@google.com | 1 |
| jcrandonkm@craigslist.org | 1 |
| Other values (3368) |
| Value | Count | Frequency (%) | |
| tguymeret@icq.com | 1 | < 0.1% | |
| mlobleyqp@loc.gov | 1 | < 0.1% | |
| blenevegh@paginegialle.it | 1 | < 0.1% | |
| jspringettic@google.com | 1 | < 0.1% | |
| jcrandonkm@craigslist.org | 1 | < 0.1% | |
| gdaguanno68@indiatimes.com | 1 | < 0.1% | |
| emccoyg6@unc.edu | 1 | < 0.1% | |
| awhitakergw@about.me | 1 | < 0.1% | |
| csharplingcz@globo.com | 1 | < 0.1% | |
| hchastelain8p@storify.com | 1 | < 0.1% | |
| Other values (3363) | 3363 | 85.1% | |
| (Missing) | 580 | 14.7% |
Length
| Max length | 35 |
|---|---|
| Median length | 21 |
| Mean length | 19.06982039 |
| Min length | 3 |
Gender
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 78 |
| Missing (%) | 2.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| Male | 1970 | 49.8% |
| Female | 1905 | 48.2% |
| (Missing) | 78 | 2.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.944093094 |
| Min length | 3 |
Dt_Applied
Categorical
| Distinct count | 3953 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| 16/08/88 | 1 |
|---|---|
| 04/08/86 | 1 |
| 12/07/83 | 1 |
| 26/10/86 | 1 |
| 17/11/85 | 1 |
| Other values (3948) |
| Value | Count | Frequency (%) | |
| 16/08/88 | 1 | < 0.1% | |
| 04/08/86 | 1 | < 0.1% | |
| 12/07/83 | 1 | < 0.1% | |
| 26/10/86 | 1 | < 0.1% | |
| 17/11/85 | 1 | < 0.1% | |
| 05/07/85 | 1 | < 0.1% | |
| 18/12/82 | 1 | < 0.1% | |
| 19/06/87 | 1 | < 0.1% | |
| 04/09/87 | 1 | < 0.1% | |
| 19/09/82 | 1 | < 0.1% | |
| Other values (3943) | 3943 | 99.7% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
University
Categorical
| Distinct count | 3140 |
|---|---|
| Unique (%) | 81.9% |
| Missing | 118 |
| Missing (%) | 3.0% |
| Memory size | 30.9 KiB |
| Abant Izzet Baysal University | 4 |
|---|---|
| Arab Open University | 4 |
| Phillips Graduate Institute | 4 |
| Universidad de Congreso | 4 |
| Jiangxi University of Traditional Chinese Medicine | 4 |
| Other values (3135) |
| Value | Count | Frequency (%) | |
| Abant Izzet Baysal University | 4 | 0.1% | |
| Arab Open University | 4 | 0.1% | |
| Phillips Graduate Institute | 4 | 0.1% | |
| Universidad de Congreso | 4 | 0.1% | |
| Jiangxi University of Traditional Chinese Medicine | 4 | 0.1% | |
| Tampere Polytechnic | 4 | 0.1% | |
| Fukuoka Institute of Technology | 4 | 0.1% | |
| Carlow College | 4 | 0.1% | |
| Universidad Valle del Momboy | 4 | 0.1% | |
| Christchurch Polytechnic Institute of Technology | 4 | 0.1% | |
| Other values (3130) | 3795 | 96.0% | |
| (Missing) | 118 | 3.0% |
Length
| Max length | 114 |
|---|---|
| Median length | 28 |
| Mean length | 29.67088287 |
| Min length | 3 |
Loan Amnt
Real number (ℝ≥0)
| Distinct count | 434 |
|---|---|
| Unique (%) | 11.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13017.499367568935 |
|---|---|
| Minimum | 1000 |
| Maximum | 35000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 3000 |
| Q1 | 6500 |
| median | 12000 |
| Q3 | 17625 |
| 95-th percentile | 30000 |
| Maximum | 35000 |
| Range | 34000 |
| Interquartile range (IQR) | 11125 |
Descriptive statistics
| Standard deviation | 8155.330342 |
|---|---|
| Coefficient of variation (CV) | 0.6264897821 |
| Kurtosis | 0.3258532123 |
| Mean | 13017.49937 |
| Median Absolute Deviation (MAD) | 5500 |
| Skewness | 0.9233128761 |
| Sum | 51458175 |
| Variance | 66509412.98 |
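Several of these descriptive statistics are functions of the others, which gives a quick consistency check. The sketch below uses the standard definitions (CV = standard deviation / mean, IQR = Q3 − Q1, variance = standard deviation squared) and the Loan Amnt values from the tables above. It is not taken from the pandas-profiling source.

```python
# Loan Amnt values copied from the quantile and descriptive statistics tables.
mean = 13017.49937
std = 8155.330342
q1, q3 = 6500, 17625
minimum, maximum = 1000, 35000
total, n_observations = 51458175, 3953

cv = std / mean                      # coefficient of variation
iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum      # range
variance = std ** 2                  # variance from the standard deviation
mean_from_sum = total / n_observations

print(f"CV: {cv:.10f}")
print(f"IQR: {iqr}")
print(f"Range: {value_range}")
print(f"Variance: {variance:.2f}")
print(f"Mean from sum: {mean_from_sum:.5f}")
```

Each recomputed value agrees with the corresponding table entry to the precision shown in the report.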
| Value | Count | Frequency (%) | |
| 12000 | 315 | 8.0% | |
| 10000 | 259 | 6.6% | |
| 15000 | 190 | 4.8% | |
| 20000 | 174 | 4.4% | |
| 6000 | 165 | 4.2% | |
| 5000 | 153 | 3.9% | |
| 35000 | 143 | 3.6% | |
| 8000 | 124 | 3.1% | |
| 16000 | 99 | 2.5% | |
| 25000 | 97 | 2.5% | |
| Other values (424) | 2234 | 56.5% |
| Value | Count | Frequency (%) | |
| 1000 | 21 | 0.5% | |
| 1100 | 1 | < 0.1% | |
| 1200 | 9 | 0.2% | |
| 1300 | 2 | 0.1% | |
| 1325 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 35000 | 143 | 3.6% | |
| 34475 | 1 | < 0.1% | |
| 34000 | 2 | 0.1% | |
| 33950 | 1 | < 0.1% | |
| 33600 | 2 | 0.1% |
Funded amnt inv
Real number (ℝ≥0)
| Distinct count | 828 |
|---|---|
| Unique (%) | 20.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12809.792160966355 |
|---|---|
| Minimum | 750.0 |
| Maximum | 35000.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 750 |
|---|---|
| 5-th percentile | 3000 |
| Q1 | 6500 |
| median | 11775 |
| Q3 | 17000 |
| 95-th percentile | 29735 |
| Maximum | 35000 |
| Range | 34250 |
| Interquartile range (IQR) | 10500 |
Descriptive statistics
| Standard deviation | 7935.907682 |
|---|---|
| Coefficient of variation (CV) | 0.619518848 |
| Kurtosis | 0.3951370723 |
| Mean | 12809.79216 |
| Median Absolute Deviation (MAD) | 5275 |
| Skewness | 0.9263171893 |
| Sum | 50637108.41 |
| Variance | 62978630.74 |
| Value | Count | Frequency (%) | |
| 12000 | 249 | 6.3% | |
| 10000 | 222 | 5.6% | |
| 6000 | 153 | 3.9% | |
| 5000 | 143 | 3.6% | |
| 15000 | 139 | 3.5% | |
| 8000 | 113 | 2.9% | |
| 7000 | 87 | 2.2% | |
| 3000 | 74 | 1.9% | |
| 20000 | 72 | 1.8% | |
| 14000 | 64 | 1.6% | |
| Other values (818) | 2637 | 66.7% |
| Value | Count | Frequency (%) | |
| 750 | 1 | < 0.1% | |
| 1000 | 20 | 0.5% | |
| 1100 | 1 | < 0.1% | |
| 1200 | 9 | 0.2% | |
| 1300 | 2 | 0.1% |
| Value | Count | Frequency (%) | |
| 35000 | 37 | 0.9% | |
| 34997.35245 | 1 | < 0.1% | |
| 34993.65539 | 1 | < 0.1% | |
| 34987.98452 | 1 | < 0.1% | |
| 34987.27101 | 1 | < 0.1% |
TERM
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 36 months | 2687 | 68.0% |
| 60 months | 1266 | 32.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Int Rate
Real number (ℝ≥0)
| Distinct count | 35 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1296908676954212 |
|---|---|
| Minimum | 0.06 |
| Maximum | 0.241 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 0.06 |
|---|---|
| 5-th percentile | 0.066 |
| Q1 | 0.099 |
| median | 0.127 |
| Q3 | 0.16 |
| 95-th percentile | 0.203 |
| Maximum | 0.241 |
| Range | 0.181 |
| Interquartile range (IQR) | 0.061 |
Descriptive statistics
| Standard deviation | 0.04160931484 |
|---|---|
| Coefficient of variation (CV) | 0.3208345782 |
| Kurtosis | -0.6951924625 |
| Mean | 0.1296908677 |
| Median Absolute Deviation (MAD) | 0.033 |
| Skewness | 0.226416223 |
| Sum | 512.668 |
| Variance | 0.001731335081 |
| Value | Count | Frequency (%) | |
| 0.117 | 324 | 8.2% | |
| 0.127 | 259 | 6.6% | |
| 0.079 | 259 | 6.6% | |
| 0.124 | 254 | 6.4% | |
| 0.135 | 231 | 5.8% | |
| 0.143 | 226 | 5.7% | |
| 0.107 | 213 | 5.4% | |
| 0.099 | 211 | 5.3% | |
| 0.089 | 198 | 5.0% | |
| 0.06 | 160 | 4.0% | |
| Other values (25) | 1618 | 40.9% |
| Value | Count | Frequency (%) | |
| 0.06 | 160 | 4.0% | |
| 0.066 | 156 | 3.9% | |
| 0.075 | 137 | 3.5% | |
| 0.079 | 259 | 6.6% | |
| 0.089 | 198 | 5.0% |
| Value | Count | Frequency (%) | |
| 0.241 | 2 | 0.1% | |
| 0.239 | 6 | 0.2% | |
| 0.235 | 6 | 0.2% | |
| 0.231 | 4 | 0.1% | |
| 0.227 | 6 | 0.2% |
INSTALLMENT
Real number (ℝ≥0)
| Distinct count | 1923 |
|---|---|
| Unique (%) | 48.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 375.2073362003542 |
|---|---|
| Minimum | 32.23 |
| Maximum | 1283.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 32.23 |
|---|---|
| 5-th percentile | 93.88 |
| Q1 | 205.86 |
| median | 336 |
| Q3 | 494.59 |
| 95-th percentile | 813.626 |
| Maximum | 1283.5 |
| Range | 1251.27 |
| Interquartile range (IQR) | 288.73 |
Descriptive statistics
| Standard deviation | 220.261152 |
|---|---|
| Coefficient of variation (CV) | 0.5870385006 |
| Kurtosis | 0.8900854243 |
| Mean | 375.2073362 |
| Median Absolute Deviation (MAD) | 140.06 |
| Skewness | 0.9837168213 |
| Sum | 1483194.6 |
| Variance | 48514.9751 |
| Value | Count | Frequency (%) | |
| 330.76 | 27 | 0.7% | |
| 396.92 | 25 | 0.6% | |
| 325.74 | 22 | 0.6% | |
| 386.7 | 21 | 0.5% | |
| 339.31 | 20 | 0.5% | |
| 322.25 | 19 | 0.5% | |
| 334.16 | 19 | 0.5% | |
| 343.09 | 18 | 0.5% | |
| 190.52 | 18 | 0.5% | |
| 368.45 | 17 | 0.4% | |
| Other values (1913) | 3747 | 94.8% |
| Value | Count | Frequency (%) | |
| 32.23 | 1 | < 0.1% | |
| 32.58 | 2 | 0.1% | |
| 33.08 | 2 | 0.1% | |
| 33.55 | 1 | < 0.1% | |
| 33.94 | 3 | 0.1% |
| Value | Count | Frequency (%) | |
| 1283.5 | 1 | < 0.1% | |
| 1276.6 | 1 | < 0.1% | |
| 1269.73 | 1 | < 0.1% | |
| 1243.85 | 1 | < 0.1% | |
| 1222.03 | 1 | < 0.1% |
GRADE
Categorical
| Distinct count | 7 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| B | 1262 | 31.9% |
| A | 908 | 23.0% |
| C | 811 | 20.5% |
| D | 510 | 12.9% |
| E | 313 | 7.9% |
| F | 125 | 3.2% |
| G | 24 | 0.6% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Sub Grade
Categorical
| Distinct count | 35 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| B3 | 324 |
|---|---|
| B5 | 260 |
| A4 | 259 |
| B4 | 254 |
| C1 | 231 |
| Other values (30) |
| Value | Count | Frequency (%) | |
| B3 | 324 | 8.2% | |
| B5 | 260 | 6.6% | |
| A4 | 259 | 6.6% | |
| B4 | 254 | 6.4% | |
| C1 | 231 | 5.8% | |
| C2 | 227 | 5.7% | |
| B2 | 213 | 5.4% | |
| B1 | 211 | 5.3% | |
| A5 | 198 | 5.0% | |
| A1 | 158 | 4.0% | |
| Other values (25) | 1618 | 40.9% |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Home Ownership
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| RENT | 2081 | 52.6% |
| MORTGAGE | 1577 | 39.9% |
| OWN | 295 | 7.5% |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 5.521123198 |
| Min length | 3 |
Annual Inc
Real number (ℝ≥0)
| Distinct count | 813 |
|---|---|
| Unique (%) | 20.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66175.9735365545 |
|---|---|
| Minimum | 8280.0 |
| Maximum | 550000.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 8280 |
|---|---|
| 5-th percentile | 25000 |
| Q1 | 40100 |
| median | 57000 |
| Q3 | 80000 |
| 95-th percentile | 135880 |
| Maximum | 550000 |
| Range | 541720 |
| Interquartile range (IQR) | 39900 |
Descriptive statistics
| Standard deviation | 40498.80417 |
|---|---|
| Coefficient of variation (CV) | 0.6119865264 |
| Kurtosis | 18.71426089 |
| Mean | 66175.97354 |
| Median Absolute Deviation (MAD) | 18000 |
| Skewness | 3.058200935 |
| Sum | 261593623.4 |
| Variance | 1640153139 |
| Value | Count | Frequency (%) | |
| 60000 | 154 | 3.9% | |
| 50000 | 149 | 3.8% | |
| 75000 | 120 | 3.0% | |
| 40000 | 120 | 3.0% | |
| 45000 | 114 | 2.9% | |
| 70000 | 96 | 2.4% | |
| 80000 | 93 | 2.4% | |
| 30000 | 93 | 2.4% | |
| 65000 | 88 | 2.2% | |
| 35000 | 82 | 2.1% | |
| Other values (803) | 2844 | 71.9% |
| Value | Count | Frequency (%) | |
| 8280 | 1 | < 0.1% | |
| 8400 | 1 | < 0.1% | |
| 9600 | 1 | < 0.1% | |
| 9960 | 1 | < 0.1% | |
| 10000 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 550000 | 1 | < 0.1% | |
| 525000 | 1 | < 0.1% | |
| 408000 | 1 | < 0.1% | |
| 400000 | 2 | 0.1% | |
| 365000 | 1 | < 0.1% |
Verification Status
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| Verified | 1515 | 38.3% |
| Not Verified | 1247 | 31.5% |
| Source Verified | 1191 | 30.1% |
Length
| Max length | 15 |
|---|---|
| Median length | 12 |
| Mean length | 11.37085758 |
| Min length | 8 |
Loan Writeoff
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3275 | 82.8% |
| 1 | 678 | 17.2% |
PURPOSE
Categorical
| Distinct count | 13 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| debt_consolidation | 2102 | 53.2% |
| credit_card | 792 | 20.0% |
| other | 297 | 7.5% |
| home_improvement | 196 | 5.0% |
| small_business | 145 | 3.7% |
| major_purchase | 100 | 2.5% |
| car | 90 | 2.3% |
| wedding | 63 | 1.6% |
| medical | 52 | 1.3% |
| moving | 39 | 1.0% |
| Other values (3) | 77 | 1.9% |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 14.28307614 |
| Min length | 3 |
Zip Code
Categorical
| Distinct count | 615 |
|---|---|
| Unique (%) | 15.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| 606xx | 55 |
|---|---|
| 900xx | 55 |
| 100xx | 54 |
| 112xx | 50 |
| 945xx | 49 |
| Other values (610) |
| Value | Count | Frequency (%) | |
| 606xx | 55 | 1.4% | |
| 900xx | 55 | 1.4% | |
| 100xx | 54 | 1.4% | |
| 112xx | 50 | 1.3% | |
| 945xx | 49 | 1.2% | |
| 070xx | 45 | 1.1% | |
| 331xx | 44 | 1.1% | |
| 750xx | 41 | 1.0% | |
| 300xx | 41 | 1.0% | |
| 113xx | 40 | 1.0% | |
| Other values (605) | 3479 | 88.0% |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Add State
Categorical
| Distinct count | 43 |
|---|---|
| Unique (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| CA | 729 | 18.4% |
| NY | 372 | 9.4% |
| FL | 304 | 7.7% |
| TX | 273 | 6.9% |
| NJ | 181 | 4.6% |
| IL | 155 | 3.9% |
| GA | 146 | 3.7% |
| PA | 136 | 3.4% |
| VA | 130 | 3.3% |
| OH | 124 | 3.1% |
| Other values (33) | 1403 | 35.5% |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
DTI
Real number (ℝ≥0)
| Distinct count | 1961 |
|---|---|
| Unique (%) | 49.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.428287376675943 |
|---|---|
| Minimum | 0.0 |
| Maximum | 29.85 |
| Zeros | 3 |
| Zeros (%) | 0.1% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3.932 |
| Q1 | 9.58 |
| median | 14.45 |
| Q3 | 19.47 |
| 95-th percentile | 24.214 |
| Maximum | 29.85 |
| Range | 29.85 |
| Interquartile range (IQR) | 9.89 |
Descriptive statistics
| Standard deviation | 6.378445753 |
|---|---|
| Coefficient of variation (CV) | 0.4420792008 |
| Kurtosis | -0.7703420751 |
| Mean | 14.42828738 |
| Median Absolute Deviation (MAD) | 4.94 |
| Skewness | -0.04903565752 |
| Sum | 57035.02 |
| Variance | 40.68457022 |
| Value | Count | Frequency (%) | |
| 11.8 | 9 | 0.2% | |
| 18.63 | 8 | 0.2% | |
| 20.88 | 8 | 0.2% | |
| 9.65 | 7 | 0.2% | |
| 12.48 | 7 | 0.2% | |
| 18.84 | 7 | 0.2% | |
| 17.67 | 7 | 0.2% | |
| 16.4 | 7 | 0.2% | |
| 19.63 | 7 | 0.2% | |
| 16.2 | 7 | 0.2% | |
| Other values (1951) | 3879 | 98.1% |
| Value | Count | Frequency (%) | |
| 0 | 3 | 0.1% | |
| 0.02 | 2 | 0.1% | |
| 0.07 | 1 | < 0.1% | |
| 0.2 | 1 | < 0.1% | |
| 0.25 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 29.85 | 1 | < 0.1% | |
| 29.83 | 1 | < 0.1% | |
| 29.73 | 1 | < 0.1% | |
| 29.72 | 1 | < 0.1% | |
| 29.63 | 1 | < 0.1% |
Delinq 2Yrs
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.10852517075638755 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 3628 |
| Zeros (%) | 91.8% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.4087983222 |
|---|---|
| Coefficient of variation (CV) | 3.766852606 |
| Kurtosis | 32.99870086 |
| Mean | 0.1085251708 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.954297207 |
| Sum | 429 |
| Variance | 0.1671160683 |
| Value | Count | Frequency (%) | |
| 0 | 3628 | 91.8% | |
| 1 | 246 | 6.2% | |
| 2 | 61 | 1.5% | |
| 3 | 13 | 0.3% | |
| 4 | 4 | 0.1% | |
| 6 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 3628 | 91.8% | |
| 1 | 246 | 6.2% | |
| 2 | 61 | 1.5% | |
| 3 | 13 | 0.3% | |
| 4 | 4 | 0.1% |
| Value | Count | Frequency (%) | |
| 6 | 1 | < 0.1% | |
| 4 | 4 | 0.1% | |
| 3 | 13 | 0.3% | |
| 2 | 61 | 1.5% | |
| 1 | 246 | 6.2% |
Inq Last 6Mths
Real number (ℝ≥0)
| Distinct count | 9 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8555527447508222 |
|---|---|
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 1822 |
| Zeros (%) | 46.1% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.997025005 |
|---|---|
| Coefficient of variation (CV) | 1.165357731 |
| Kurtosis | 2.163689287 |
| Mean | 0.8555527448 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.26526022 |
| Sum | 3382 |
| Variance | 0.9940588606 |
| Value | Count | Frequency (%) | |
| 0 | 1822 | 46.1% | |
| 1 | 1245 | 31.5% | |
| 2 | 584 | 14.8% | |
| 3 | 265 | 6.7% | |
| 4 | 21 | 0.5% | |
| 5 | 10 | 0.3% | |
| 6 | 3 | 0.1% | |
| 7 | 2 | 0.1% | |
| 8 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 1822 | 46.1% | |
| 1 | 1245 | 31.5% | |
| 2 | 584 | 14.8% | |
| 3 | 265 | 6.7% | |
| 4 | 21 | 0.5% |
| Value | Count | Frequency (%) | |
| 8 | 1 | < 0.1% | |
| 7 | 2 | 0.1% | |
| 6 | 3 | 0.1% | |
| 5 | 10 | 0.3% | |
| 4 | 21 | 0.5% |
Pub Rec
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 30.9 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3831 | 96.9% |
| 1 | 120 | 3.0% |
| 2 | 2 | 0.1% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Revol Bal
Real number (ℝ≥0)
| Distinct count | 3672 |
|---|---|
| Unique (%) | 92.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14367.447508221603 |
|---|---|
| Minimum | 0 |
| Maximum | 140967 |
| Zeros | 42 |
| Zeros (%) | 1.1% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1240.4 |
| Q1 | 6352 |
| median | 11449 |
| Q3 | 18151 |
| 95-th percentile | 35148.4 |
| Maximum | 140967 |
| Range | 140967 |
| Interquartile range (IQR) | 11799 |
Descriptive statistics
| Standard deviation | 13468.63453 |
|---|---|
| Coefficient of variation (CV) | 0.937441012 |
| Kurtosis | 18.01764983 |
| Mean | 14367.44751 |
| Median Absolute Deviation (MAD) | 5657 |
| Skewness | 3.322035836 |
| Sum | 56794520 |
| Variance | 181404116.1 |
| Value | Count | Frequency (%) | |
| 0 | 42 | 1.1% | |
| 8032 | 3 | 0.1% | |
| 6314 | 3 | 0.1% | |
| 14848 | 3 | 0.1% | |
| 10980 | 3 | 0.1% | |
| 11338 | 3 | 0.1% | |
| 15183 | 3 | 0.1% | |
| 8357 | 3 | 0.1% | |
| 6565 | 3 | 0.1% | |
| 13034 | 3 | 0.1% | |
| Other values (3662) | 3884 | 98.3% |
| Value | Count | Frequency (%) | |
| 0 | 42 | 1.1% | |
| 3 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% | |
| 16 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 140967 | 1 | < 0.1% | |
| 131949 | 1 | < 0.1% | |
| 130920 | 1 | < 0.1% | |
| 124744 | 1 | < 0.1% | |
| 123416 | 1 | < 0.1% |
Total Paymnt
Real number (ℝ≥0)
| Distinct count | 3710 |
|---|---|
| Unique (%) | 93.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14435.064318165443 |
|---|---|
| Minimum | 0.0 |
| Maximum | 58886.47343 |
| Zeros | 2 |
| Zeros (%) | 0.1% |
| Memory size | 30.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2401.064047 |
| Q1 | 6614.78722 |
| median | 11907.35 |
| Q3 | 19190.68001 |
| 95-th percentile | 35788.92425 |
| Maximum | 58886.47343 |
| Range | 58886.47343 |
| Interquartile range (IQR) | 12575.89279 |
Descriptive statistics
| Standard deviation | 10492.53033 |
|---|---|
| Coefficient of variation (CV) | 0.7268779753 |
| Kurtosis | 1.593830926 |
| Mean | 14435.06432 |
| Median Absolute Deviation (MAD) | 5937.176941 |
| Skewness | 1.261678967 |
| Sum | 57061809.25 |
| Variance | 110093192.6 |
| Value | Count | Frequency (%) | |
| 14288.76169 | 8 | 0.2% | |
| 13148.13786 | 7 | 0.2% | |
| 11907.34732 | 7 | 0.2% | |
| 12029.45 | 7 | 0.2% | |
| 11600.98 | 6 | 0.2% | |
| 14288.77 | 5 | 0.1% | |
| 11726.32 | 5 | 0.1% | |
| 10956.77596 | 5 | 0.1% | |
| 9011.557494 | 5 | 0.1% | |
| 13263.96 | 5 | 0.1% | |
| Other values (3700) | 3893 | 98.5% |
| Value | Count | Frequency (%) | |
| 0 | 2 | 0.1% | |
| 91.39 | 1 | < 0.1% | |
| 151.8 | 1 | < 0.1% | |
| 165.37 | 1 | < 0.1% | |
| 203.55 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 58886.47343 | 1 | < 0.1% | |
| 58133.3199 | 1 | < 0.1% | |
| 58090.95207 | 1 | < 0.1% | |
| 58071.19982 | 1 | < 0.1% | |
| 58071.19977 | 1 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
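The calculation described above (covariance divided by the product of the standard deviations) can be sketched in plain Python. The function name `pearson_r` and the toy data are illustrative, and this is not the implementation the report uses.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors of covariance and variances cancel, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(xs, ys)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(xs, [3.0 * y + 7.0 for y in ys])

print(r_original, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale mentioned in the text.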
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
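As a sketch of that definition, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. Assigning tied values their average rank is an assumption here (a common convention), and the function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear

print("Pearson r:", pearson_r(xs, ys))
print("Spearman rho:", spearman_rho(xs, ys))
```

For the cubic toy data, ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.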
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
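The pair-counting definition above translates directly into code. The sketch below implements exactly that formula (often called τ-a), with tied pairs counted as neither concordant nor discordant. Tie-adjusted variants exist, and which variant the report computes is not stated here. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0: a tie in x or y, counted as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This direct version compares every pair and is therefore quadratic in the number of observations, which is fine for illustration but slow for large datasets.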
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Name | Email ID | Gender | Dt_Applied | University | Loan Amnt | Funded amnt inv | TERM | Int Rate | INSTALLMENT | GRADE | Sub Grade | Home Ownership | Annual Inc | Verification Status | Loan Writeoff | PURPOSE | Zip Code | Add State | DTI | Delinq 2Yrs | Inq Last 6Mths | Pub Rec | Revol Bal | Total Paymnt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Calley Giron | cgiron0@ehow.com | Female | 01/01/81 | Warner Southern College | 5000 | 4975.0 | 36 months | 0.107 | 162.87 | B | B2 | RENT | 24000.0 | Verified | 0 | credit_card | 860xx | AZ | 27.65 | 0 | 1 | 0 | 13648 | 5863.155187 |
| 1 | Linus Stud | lstud1@washington.edu | Male | 02/01/81 | Shri Lal Bahadur Shastri Rashtriya Sanskrit Vidyapeetha | 2500 | 2500.0 | 60 months | 0.153 | 59.83 | C | C4 | RENT | 30000.0 | Source Verified | 1 | car | 309xx | GA | 1.00 | 0 | 5 | 0 | 1687 | 1014.530000 |
| 2 | Lorelle Ambage | lambage2@wix.com | Female | 03/01/81 | Technische Universität Bergakademie Freiberg | 2400 | 2400.0 | 36 months | 0.160 | 84.33 | C | C5 | RENT | 12252.0 | Not Verified | 0 | small_business | 606xx | IL | 8.72 | 0 | 2 | 0 | 2956 | 3005.666844 |
| 3 | Anna-diane Larrat | alarrat3@economist.com | Female | 04/01/81 | Divine Word College of Legazpi | 10000 | 10000.0 | 36 months | 0.135 | 339.31 | C | C1 | RENT | 49200.0 | Source Verified | 0 | other | 917xx | CA | 20.00 | 0 | 1 | 0 | 5598 | 12231.890000 |
| 4 | Gill Ruske | NaN | Female | 05/01/81 | East China Jiao Tong University | 3000 | 3000.0 | 60 months | 0.127 | 67.79 | B | B5 | RENT | 80000.0 | Source Verified | 0 | other | 972xx | OR | 17.94 | 0 | 0 | 0 | 27783 | 4066.908161 |
| 5 | Evelyn MacFaul | emacfaul5@theatlantic.com | Female | 06/01/81 | Ahmedabad University | 5000 | 5000.0 | 36 months | 0.079 | 156.46 | A | A4 | RENT | 36000.0 | Source Verified | 0 | wedding | 852xx | AZ | 11.20 | 0 | 3 | 0 | 7963 | 5632.210000 |
| 6 | Ainslie Rainard | arainard6@virginia.edu | Female | 07/01/81 | NaN | 7000 | 7000.0 | 60 months | 0.160 | 170.08 | C | C5 | RENT | 47004.0 | Not Verified | 0 | debt_consolidation | 280xx | NC | 23.51 | 0 | 1 | 0 | 17726 | 10137.840010 |
| 7 | Emmott Hamby | ehamby7@prnewswire.com | Male | 08/01/81 | Institute of Business Management | 3000 | 3000.0 | 36 months | 0.186 | 109.43 | E | E1 | RENT | 48000.0 | Source Verified | 0 | car | 900xx | CA | 5.35 | 0 | 2 | 0 | 8221 | 3939.135294 |
| 8 | Shem Toomer | stoomer8@home.pl | Male | 09/01/81 | Osaka University of Education | 5600 | 5600.0 | 60 months | 0.213 | 152.39 | F | F2 | OWN | 40000.0 | Source Verified | 1 | small_business | 958xx | CA | 5.55 | 0 | 2 | 0 | 5210 | 647.500000 |
| 9 | Giana Aberhart | gaberhart9@mozilla.com | Female | 10/01/81 | American Public University | 5375 | 5350.0 | 60 months | 0.127 | 121.45 | B | B5 | RENT | 15000.0 | Verified | 1 | other | 774xx | TX | 18.08 | 0 | 0 | 0 | 9279 | 1484.590000 |
Last rows
| | Name | Email ID | Gender | Dt_Applied | University | Loan Amnt | Funded amnt inv | TERM | Int Rate | INSTALLMENT | GRADE | Sub Grade | Home Ownership | Annual Inc | Verification Status | Loan Writeoff | PURPOSE | Zip Code | Add State | DTI | Delinq 2Yrs | Inq Last 6Mths | Pub Rec | Revol Bal | Total Paymnt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3943 | Merla Thebe | mthebeq7@cocolog-nifty.com | Female | 21/10/91 | North Eastern Hill University | 6000 | 6000.0 | 36 months | 0.163 | 211.81 | D | D1 | RENT | 39564.0 | Verified | 1 | debt_consolidation | 606xx | IL | 23.78 | 2 | 1 | 0 | 2028 | 3388.960000 |
| 3944 | Marcellina Dinneges | mdinnegesq8@infoseek.co.jp | Female | 22/10/91 | Universidade Católica de Santos | 2400 | 2400.0 | 36 months | 0.117 | 79.39 | B | B3 | RENT | 39800.0 | Not Verified | 0 | other | 303xx | GA | 14.32 | 0 | 0 | 0 | 15497 | 2836.660516 |
| 3945 | Way Symonds | wsymondsq9@mlb.com | Male | 23/10/91 | American International University West Africa | 25000 | 25000.0 | 60 months | 0.183 | 638.25 | D | D5 | MORTGAGE | 156000.0 | Source Verified | 0 | house | 944xx | CA | 5.85 | 0 | 0 | 0 | 10709 | 37936.750000 |
| 3946 | Ailene Matejka | NaN | Female | 24/10/91 | Kaya University | 20000 | 20000.0 | 36 months | 0.117 | 661.52 | B | B3 | RENT | 80700.0 | Verified | 0 | debt_consolidation | 946xx | CA | 13.67 | 0 | 1 | 0 | 7211 | 23406.523000 |
| 3947 | Samuel Overel | NaN | Male | 25/10/91 | Northwestern University | 12000 | 12000.0 | 60 months | 0.183 | 306.36 | D | D5 | MORTGAGE | 34000.0 | Not Verified | 1 | debt_consolidation | 177xx | PA | 12.56 | 0 | 0 | 0 | 6114 | 9667.950000 |
| 3948 | Corbie Creeboe | ccreeboeqc@sitemeter.com | Male | 26/10/91 | Shaheed Rajaei Teacher Training University | 12000 | 12000.0 | 36 months | 0.135 | 407.17 | C | C1 | RENT | 125000.0 | Source Verified | 0 | wedding | 086xx | NJ | 13.18 | 0 | 1 | 0 | 46286 | 14657.917650 |
| 3949 | Bobbe Ochterlonie | bochterlonieqd@ezinearticles.com | Female | 27/10/91 | Dhofar University | 15000 | 15000.0 | 36 months | 0.124 | 501.23 | B | B4 | RENT | 72000.0 | Verified | 0 | debt_consolidation | 104xx | NY | 7.47 | 0 | 1 | 0 | 12147 | 16729.253640 |
| 3950 | Corella Esposito | cespositoqe@macromedia.com | Female | 28/10/91 | University of Jan Evangelista Purkyne | 12000 | 12000.0 | 36 months | 0.060 | 365.23 | A | A1 | OWN | 48000.0 | Not Verified | 0 | debt_consolidation | 365xx | AL | 23.35 | 0 | 0 | 0 | 22385 | 13148.137860 |
| 3951 | Prince Dibdin | pdibdinqf@businessinsider.com | Male | 29/10/91 | College in Sládkovičovo | 15000 | 15000.0 | 60 months | 0.160 | 364.46 | C | C5 | RENT | 50000.0 | Verified | 1 | debt_consolidation | 907xx | CA | 18.26 | 0 | 1 | 0 | 9799 | 10883.540000 |
| 3952 | Georgette Warratt | gwarrattqg@java.com | Female | 30/10/91 | Technical University of Lublin | 15000 | 14975.0 | 60 months | 0.153 | 358.98 | C | C4 | MORTGAGE | 32976.0 | Not Verified | 1 | debt_consolidation | 177xx | PA | 17.90 | 0 | 1 | 0 | 7956 | 11704.260000 |